# 128k ultra-long context

Qwen3 128k 30B A3B NEO MAX Imatrix Gguf
Apache-2.0
A GGUF quantization of the Qwen3-30B-A3B Mixture of Experts model, extended to 128k context and quantized with the NEO Imatrix method, supporting multilingual and multitask use.
Large Language Model, Supports Multiple Languages
DavidAU
17.20k
10
Qwen3 32B 128k NEO Imatrix Max GGUF
Apache-2.0
The NEO Imatrix quantized version of the Qwen3-32B model, keeping the output tensor in BF16 to improve inference and generation quality, with support for a 128k context length.
Large Language Model
DavidAU
1,437
2
Qwen3 32B 128k HORROR Imatrix Max GGUF
Apache-2.0
A horror-themed text generation model based on Qwen3-32B, enhanced with Imatrix quantization for improved reasoning and supporting a 128k ultra-long context.
Large Language Model
DavidAU
1,963
2
Mistral Small 3.1 24B Instruct 2503 MAX NEO Imatrix GGUF
Apache-2.0
A 24B-parameter instruction-tuned model from Mistral AI, supporting a 128k context length and multilingual processing, enhanced with NEO Imatrix and the MAX quantization scheme.
Large Language Model, Supports Multiple Languages
DavidAU
38.29k
31
Gemma 3 12b It MAX HORROR Imatrix GGUF
Apache-2.0
A horror-style instruction-tuned version of Google's Gemma-3 model, featuring NEO Imatrix and extreme quantization, and supporting a 128k context length.
Large Language Model
DavidAU
5,072
13
Llama 3.3 70b Instruct Awq
Llama 3.3 is a 70-billion-parameter multilingual large language model developed by Meta, optimized for multilingual dialogue use cases, that performs well on multiple benchmarks.
Large Language Model, Transformers, Supports Multiple Languages
casperhansen
47.12k
32
Llama 3.2 3B Instruct NEO SI FI GGUF
Apache-2.0
A 3B-parameter instruction-tuned model based on the Llama-3.2 architecture, incorporating the NEO Imatrix sci-fi dataset and supporting 128k long-context generation.
Large Language Model, Supports Multiple Languages
DavidAU
725
8
Llama 3.1 405B FP8
Meta Llama 3.1 is a multilingual large language model collection, including 8B, 70B, and 405B parameter pre-trained and instruction-tuned generative models, supporting 8 languages with outstanding performance on industry benchmarks.
Large Language Model, Transformers, Supports Multiple Languages
meta-llama
540
115
Llama 3.1 405B Instruct FP8
Meta Llama 3.1 is a multilingual large language model series, including pre-trained and instruction-tuned generative models with 8B, 70B, and 405B scales. The 405B version is optimized for multilingual dialogue scenarios and performs excellently in common industry benchmarks.
Large Language Model, Transformers, Supports Multiple Languages
meta-llama
7,406
188
Llama 3.1 70B Instruct
Meta Llama 3.1 is a set of pretrained and instruction-tuned generative models with 8B, 70B, and 405B parameters, optimized for multilingual conversation scenarios, supporting 8 languages and code generation.
Large Language Model, Transformers, Supports Multiple Languages
meta-llama
1.2M
806
Llama 3.1 405B
Llama 3.1 is a multilingual large language model series released by Meta, available in 8B, 70B, and 405B sizes, supporting 8 languages, with outstanding performance in industry benchmarks.
Large Language Model, Transformers, Supports Multiple Languages
meta-llama
19.20k
927
Llama 3.1 70B
Meta Llama 3.1 is a large language model series supporting 8 languages, available in 8B, 70B, and 405B sizes, outperforming most open-source and proprietary chat models in industry benchmarks.
Large Language Model, Transformers, Supports Multiple Languages
meta-llama
97.35k
358
© 2025 AIbase